In October 2022, the City of Philadelphia introduced a pilot program that tested 20 paid, curbside loading spaces for delivery drivers in Center City, known as “smart loading zones.”
A conventional loading zone is a dedicated space within the parking lane that gives vehicles room to load and unload. Parking is not permitted in a loading zone, and there is usually a fixed time limit for each vehicle. The City posts the rules for each loading zone on signs next to the space and expects users to read them before using it.
Like a conventional loading zone, a Smart Loading Zone is a dedicated space on the street for loading and unloading. However, the availability and regulations of Smart Loading Zones are digitally codified, and the zones are bookable through the Pebble Driver app. This digitization of physical space allowed drivers and delivery companies to reserve spaces and times through a smartphone app and to pay only for the time they used.
This pilot was conducted for six months from October 2022 to April 2023.
The pilot generated data about individual vehicles parking in each zone. The increase in home delivery and on-demand logistics has created a need for new tools to decongest the right-of-way. Our project explores the potential of opening new smart loading zones in the city. Using pilot data, we created a predictive model that can be used to estimate demand at new locations.
Through an iterative process, our team built a model using the client’s data on bookings made through the Smart Loading Zones app. We also integrated external data sources, such as OpenStreetMap and census data, to enrich the model’s features. This iterative approach allowed us to refine the model step by step as we worked to predict Smart Loading Zone demand.
Understanding demand for Smart Loading Zones means considering the factors that influence a driver’s need to stop. First, the distance to the nearest location of each land use can indicate potential loading or unloading needs. Second, the volume and purpose of vehicles using the road give insight into overall traffic flow and potential demand for loading zones. Official road classifications add context about road infrastructure and usage patterns. Finally, distinguishing rush hour from off-peak times helps identify peak demand periods: by segmenting bookings into different times of the day, the model can better anticipate varying demand levels.
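The time-of-day segmentation described above can be sketched as follows. This is a minimal illustration rather than the project’s code: the period boundaries and the names `PERIODS`, `time_period`, and `is_weekend` are assumptions.

```python
from datetime import datetime

# Hypothetical period boundaries (start hour inclusive, end hour exclusive).
# These cut-offs are assumptions for illustration, not the pilot's definitions.
PERIODS = [
    ("AM Rush", 7, 10),
    ("Mid Day", 10, 15),
    ("PM Rush", 15, 18),
]


def time_period(ts):
    """Return the time-of-day label for a booking start time."""
    for label, start, end in PERIODS:
        if start <= ts.hour < end:
            return label
    return "Overnight"  # every hour outside the daytime windows


def is_weekend(ts):
    """True for Saturday (5) and Sunday (6)."""
    return ts.weekday() >= 5


booking_start = datetime(2022, 11, 3, 16, 30)  # invented example timestamp
print(time_period(booking_start), is_weekend(booking_start))
```

Keeping the boundaries in one table makes it easy to test how sensitive the model is to where the rush-hour windows are drawn.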
The booking data include both the number of reservations made through the app and violations, where drivers use the space without a prior booking. The curb data include an inventory of point and linear assets along each curb, ranging from trees to stop signs. However, the data underrepresent actual curb use, because a significant portion of drivers use the space in the conventional way rather than through the app. This points to a need for further engagement to encourage broader adoption of Smart Loading Zones.
The charts provide a clear visualization of the data, highlighting a notably high number of violations. Furthermore, within the booking events, the predominant dwell time category is 20 minutes.
This chart shows some curb zones with high vehicle counts in the ‘16-30’ and ‘31-45’ minute segments. There is also a noticeable number of violations in certain zones along Chestnut Street and Sansom Street. Most curb zones have many violations, and some zones along Chestnut Street show a high number of ‘not_authorized’ and ‘overstayed’ violations. These zones are hotspots for both high vehicle turnover and parking violations.
Plotting the violations shows that the majority are of the ‘not_authorized’ type. A comparison of time segments with and without violations reveals an interesting pattern: Walnut Street and Sansom Street typically exhibit longer dwell times of 46-60 minutes, whereas Chestnut Street most commonly records dwell times of 16-30 minutes.
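The dwell-time segments used in these charts can be reproduced with a simple binning step. The sketch below assumes 15-minute bins; the ‘16-30’, ‘31-45’, and ‘46-60’ labels come from the charts, while the ‘0-15’ and ‘60+’ labels, the function name, and the sample values are assumptions.

```python
import math
from collections import Counter


def dwell_segment(minutes):
    """Map a dwell time in minutes to a 15-minute segment label."""
    if minutes < 0:
        raise ValueError("dwell time cannot be negative")
    if minutes > 60:
        return "60+"  # assumed catch-all label for long stays
    upper = max(15, math.ceil(minutes / 15) * 15)
    lower = 0 if upper == 15 else upper - 14
    return f"{lower}-{upper}"


dwell_minutes = [20, 12, 47, 33, 95, 20]  # invented sample values
segment_counts = Counter(dwell_segment(m) for m in dwell_minutes)
print(segment_counts)
```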
The event data also show the distribution of events and dwell time across curb zones. Certain curbs have significantly more events, mainly on Chestnut Street and Sansom Street. Both have longer operating hours: Chestnut Street zones operate 24/7 and Sansom Street zones operate on weekdays from 6 am to 4 pm, while Walnut Street zones operate only from 6 am to 10 am on weekdays.
By analyzing the event data over time, we observe fluctuating curb zone usage from October 2022 to April 2023, without a clear overall increasing or decreasing trend. Weekdays have more loading events than weekends, with the most activity during the PM rush, followed by the overnight and midday periods. AM rush events are the least frequent. The PM rush and midday peak is a distinct feature of the weekday pattern, whereas weekend events are more evenly distributed throughout the day, at a lower volume.
The parking duration chart shows common parking times at 20, 80, and 120 minutes, with 20 minutes being the most frequent. The vehicle type chart indicates that cars are the most common, with far fewer events for trucks, vans, and freight vehicles.
The total bookings by curb are presented below. Notably, 1000 Chestnut St and 1200 Sansom St register significantly more bookings, likely because of their extended operating hours throughout the week. Additionally, on each east-west street, the blocks between 8th and 10th Streets and between 12th and 13th Streets see more events.
For the road network analysis, we used the complete streets dataset from OpenDataPhilly. We then filtered the dataset to our streets of interest (Chestnut, Sansom, and Walnut Streets) and kept their characteristics for further analysis.
The bar plot shows the number of booking events on each street. Chestnut Street has more bookings than Sansom and Walnut Streets, because the loading zones on Chestnut Street have longer operating hours than those on the other two streets. The boxplot shows the distribution of booking events throughout the day, including peak booking hours for each street.
In this chart, events are summarized by day of the week for each street to identify daily traffic patterns. Chestnut Street has more booking events than the other two streets throughout the week, with its highest count on Thursday.
The interactive map shows the booking locations and the surrounding bike network. Most of the curbside loading zones are located on streets in the bike network, which creates potential conflicts between cyclists and vehicles and can affect traffic flow. Planners should take this into consideration when siting new curb zones.
This bar chart shows the number of booking events by street, highlighted by whether the street is part of the bike network.
The “amenity” types we collected were: “bar”, “cafe”, “fast_food”, “pub”, “restaurant”, “college”, “school”, “university”, “parking_space”, “bank”, “atm”, “clinic”, “hospital”, “pharmacy”, “community_centre”, “conference_centre”, “nightclub”, “theatre”, “police”, “post_box”, “post_office”, “place_of_worship”
The “building” types we collected were: “apartments”, “dormitory”, “hotel”, “commercial”, “office”, “retail”, “supermarket”, “warehouse”, “church”, “college”, “government”, “hospital”, “public”, “school”, “university”
The “shop” types we collected were: “alcohol”, “bakery”, “beverages”, “coffee”, “convenience”, “deli”, “department_store”, “general”, “supermarket”, “clothes”, “gift”
We also collected office building types and land uses, but the samples were too small to use.
Assuming that proximity to certain types of buildings relates to how often cars and trucks need to use nearby loading zones, we calculated the distance from each pilot loading zone to the three nearest features of each of these types.
Many of the strongest relationships are negative. In particular, the number of parking events at a loading zone decreases as distance from a place of worship increases.
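A minimal sketch of this nearest-neighbor step follows. It assumes that zones and OSM features are available as (latitude, longitude) pairs and uses great-circle distance; the function names and the choice to average the three distances afterwards are assumptions, since the text does not say how the distances were summarized.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def nearest_distances(zone, features, k=3):
    """Sorted distances from one zone to its k nearest features of a single type."""
    if not features:
        raise ValueError("no features of this type")
    return sorted(haversine_m(zone, f) for f in features)[:k]


# Invented coordinates, for illustration only.
zone = (39.9500, -75.1600)
cafes = [(39.9510, -75.1610), (39.9490, -75.1580), (39.9550, -75.1700), (39.9600, -75.1500)]

three_nearest = nearest_distances(zone, cafes)
mean_distance = sum(three_nearest) / len(three_nearest)  # assumed summary statistic
```

For a study area the size of Center City, projecting the coordinates to a local planar system and using Euclidean distance would give nearly the same result.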
The positive relationships are associated with clinics and community centers.
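The sign and strength of these relationships can be checked with a Pearson correlation between each distance variable and the event count per zone. The sketch below implements the coefficient directly; the sample numbers are invented and only show what a negative relationship looks like.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented values: distance to the nearest feature (m) and events per zone.
distance_m = [50, 120, 200, 310, 400]
events = [90, 70, 65, 30, 20]

r = pearson_r(distance_m, events)  # negative: fewer events farther away
```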
One of our OSM queries retrieved building types. This interactive map allows you to compare where the pilot zones are in relation to various types of buildings. For example, you can see large clusters of hospitals and universities in University City and “commercial” buildings in the office district west of City Hall.
Identifying the type of street that each pilot zone is on is important to understanding the type of traffic that uses that street.
Calculating the distance to each type of road classification can be a proxy for calculating full network relationships. We calculated the distance from each curb to each type of street, from highways all the way down to alleys.
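One way to compute these distances is to take, for each road class, the minimum point-to-segment distance from the curb to any street segment of that class. The sketch below assumes projected planar coordinates (for example, metres in a local coordinate system) and a hypothetical input of `(road_class, start_xy, end_xy)` tuples; the class labels are placeholders.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_by_road_class(curb, segments):
    """Minimum distance from a curb point to the nearest segment of each class."""
    best = {}
    for road_class, start, end in segments:
        d = point_segment_distance(curb, start, end)
        if d < best.get(road_class, math.inf):
            best[road_class] = d
    return best


# Invented geometry in metres.
segments = [
    ("class_2", (0, 100), (500, 100)),
    ("class_4", (0, 30), (500, 30)),
    ("class_4", (250, 0), (250, 500)),
]
print(distance_by_road_class((200, 0), segments))
```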
With more than 250 unique variables collected from OSM, we streamlined the data into 11 simpler categories. This removes the noise that comes from variables with very small groups or with definitions very similar to those of other variables.
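The consolidation step can be pictured as a lookup from raw OSM tags to broader categories, keeping the smallest distance within each category. The mapping below is hypothetical: it borrows a few category names that appear in the variable-importance discussion (dining, grocery, housing, civic), but the actual 11 categories and their members are not listed in this report.

```python
# Hypothetical tag-to-category mapping, for illustration only.
CATEGORY_OF = {
    "restaurant": "dining",
    "cafe": "dining",
    "fast_food": "dining",
    "supermarket": "grocery",
    "convenience": "grocery",
    "apartments": "housing",
    "dormitory": "housing",
    "police": "civic",
    "post_office": "civic",
}


def collapse(distances):
    """Reduce per-tag distances to the nearest distance per category."""
    out = {}
    for tag, d in distances.items():
        category = CATEGORY_OF.get(tag)
        if category is None:
            continue  # tag not assigned to any category in this sketch
        out[category] = min(d, out.get(category, float("inf")))
    return out


per_tag = {"restaurant": 120.0, "cafe": 45.0, "supermarket": 300.0, "unknown_tag": 5.0}
per_category = collapse(per_tag)
```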
Simplified Panel with time, road network, and nearest neighbors variables.
Hyperparameter tuning gave the following optimal combination: mtry = 17, nodesize = 22, ntree = 1000, maxnodes = 20.
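A search of this kind can be sketched as an exhaustive grid over candidate values. The report does not describe the tuning procedure, so everything below is an assumption except the four parameter names: the candidate values are illustrative, and `toy_score` is a stand-in for a cross-validated error, not a real model evaluation.

```python
from itertools import product

# Candidate values are illustrative assumptions; only the parameter names
# (mtry, nodesize, ntree, maxnodes) come from the report.
GRID = {
    "mtry": [5, 11, 17],
    "nodesize": [5, 22],
    "ntree": [500, 1000],
    "maxnodes": [10, 20],
}


def grid_search(grid, score):
    """Return (best_params, best_score) with the lowest score over the full grid."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        s = score(params)
        if s < best_score:
            best_params, best_score = params, s
    return best_params, best_score


def toy_score(params):
    """Stand-in for a cross-validated error; NOT a real model evaluation."""
    return abs(params["mtry"] - 11) + abs(params["maxnodes"] - 10) / 10


best, err = grid_search(GRID, toy_score)
```

In practice `score` would fit the random forest with the given parameters and return a held-out error such as MAE.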
The chart visualizes the relative importance of the top 20 variables in a predictive model. Notably, the ‘week’ variable stands out as the most significant predictor, followed by the day variable, and the distance to class 4 and class 2 roads, indicating their substantial influence on the model’s output. Other proximity-related variables such as civic, retail, grocery, housing, dining, parking amenities, and schools are also identified as key factors, though to a lesser extent than week and road class variables.
The provided chart showcases the performance of a Random Forest model used for predicting bookings on a weekly basis. It indicates that the model tends to underperform particularly in extreme scenarios, such as when bookings are unusually high. The model shows a noticeable disparity in predictive accuracy for bookings that exceed an average of 1, often underestimating the actual values. Additionally, it is noteworthy that there are no bookings during the week of Christmas.
Beyond that, there is an observable seasonal trend in bookings that the model does not fully capture. This is likely because the training data cover only October through April, which may not be representative of the entire year. The seasonal effect and the limited data scope may contribute to the model’s reduced effectiveness during peak booking periods and would limit its reliability in months outside the pilot, such as summer.
This map gives a spatial overview of the model’s performance across streets. Larger, brighter circles indicate a higher MAE at that curb. The model’s predictions are less accurate on Chestnut Street and Sansom Street, while Walnut Street performs relatively better.
In the chart, high error combined with high data counts, especially on Tuesday and Thursday, suggests a systematic issue with the model rather than random fluctuation. Weekly patterns or specific events that affect model performance therefore need further checking. Points with low error and low data counts, such as Monday, indicate good performance.
Predictions are more accurate on Monday and Friday, and errors are higher in the middle of the week, especially on Tuesday.
Overall, the model is more accurate for lower booking values and diverges from the observed data at higher booking values. The similar trend lines in the five panels suggest that the model behaves consistently across weekdays, but it fails to capture high values on any of them. Specifically, Monday and Friday show good prediction accuracy, while the other days (particularly Wednesday) show significant prediction errors.
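The per-weekday comparison above rests on two numbers for each group: the mean absolute error and the number of observations behind it. A minimal sketch of that grouping follows, with invented rows and a hypothetical `(group, observed, predicted)` input format; the same function works for grouping by curb.

```python
from collections import defaultdict


def mae_by_group(rows):
    """Return {group: (mae, count)} from (group, observed, predicted) rows."""
    totals = defaultdict(lambda: [0.0, 0])  # group -> [sum of |error|, row count]
    for group, observed, predicted in rows:
        totals[group][0] += abs(observed - predicted)
        totals[group][1] += 1
    return {g: (err_sum / n, n) for g, (err_sum, n) in totals.items()}


# Invented rows, for illustration only.
rows = [
    ("Mon", 1, 1.0),
    ("Tue", 3, 1.0),
    ("Tue", 2, 1.0),
    ("Tue", 0, 0.5),
]
summary = mae_by_group(rows)
```

Reading the error next to the count helps separate a systematic miss (high error on many rows) from noise (high error on a handful of rows).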
While the model is not good at capturing event counts larger than 1, it is still 75% accurate at predicting whether a curb will be occupied. The ROC curve further illustrates this performance. Specifically, the accuracy, sensitivity, and positive predictive value are relatively high, but the model struggles with specificity and negative predictive value, suggesting it is better at predicting positives than negatives.
The cross-validation chart shows that, for the random forest model under 50-fold cross-validation, the goodness-of-fit metrics (MAE, RMSE, and R-squared) cluster closely around their respective means. The model’s predictions were off by over 0.47 on average (MAE). The RMSE is similarly consistent across folds, but because it squares the errors it gives more weight to larger ones. R-squared values around 0.23 imply that the model explains only 23% of the variability in the data, so it is not capturing the complexity of the data.
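The five measures named above all derive from one confusion matrix. The sketch below computes them for a binary occupied / not-occupied outcome. Treating any observed count above zero as occupied and any prediction of at least 0.5 as a predicted occupancy are assumptions, and the sample values are invented.

```python
def classification_metrics(observed, predicted):
    """Confusion-matrix metrics for two equal-length sequences of booleans."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)          # occupied, predicted occupied
    tn = sum(1 for o, p in pairs if not o and not p)  # empty, predicted empty
    fp = sum(1 for o, p in pairs if not o and p)      # empty, predicted occupied
    fn = sum(1 for o, p in pairs if o and not p)      # occupied, predicted empty

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # share of occupied periods caught
        "specificity": ratio(tn, tn + fp),  # share of empty periods caught
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


observed_counts = [2, 1, 1, 0, 0]             # invented event counts
predicted_counts = [0.9, 0.6, 0.2, 0.7, 0.1]  # invented model outputs

observed = [c > 0 for c in observed_counts]
predicted = [c >= 0.5 for c in predicted_counts]  # assumed cut-off
metrics = classification_metrics(observed, predicted)
```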
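For reference, the three metrics and a simple fold assignment can be written out as below. This is a sketch, not the project’s evaluation code: the R-squared here is the coefficient-of-determination form (one minus the ratio of residual to total sum of squares), whereas some packages report the squared correlation between predictions and observations, and the interleaved fold assignment is an assumption.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, and R-squared for two equal-length sequences."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


def k_fold_indices(n, k):
    """Split row indices 0..n-1 into k interleaved folds of near-equal size."""
    return [list(range(i, n, k)) for i in range(k)]


observed = [0, 0, 2, 2]   # invented values
predicted = [0, 1, 1, 2]
scores = regression_metrics(observed, predicted)
folds = k_fold_indices(10, 5)
```

Because RMSE squares each error before averaging, it is always at least as large as MAE, and the gap widens when a few predictions miss badly.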